Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts

Abstract

Automatic medical text simplification can assist providers with patient-friendly communication and make medical texts more accessible, thereby improving health literacy. But curating a quality corpus for this task requires the supervision of medical experts. In this work, we present Med-EASi (Medical dataset for Elaborative and Abstractive Simplification), a uniquely crowdsourced and finely annotated dataset for supervised simplification of short medical texts. Its expert-layman-AI collaborative annotations facilitate controllability over text simplification by marking four kinds of textual transformations: elaboration, replacement, deletion, and insertion. To learn medical text simplification, we fine-tune T5-large with four different styles of input-output combinations, leading to two control-free and two controllable versions of the model. We add two types of controllability into text simplification by using a multi-angle training approach: position-aware, which uses in-place annotated inputs and outputs, and position-agnostic, where the model only knows the contents to be edited, but not their positions. Our results show that our fine-grained annotations improve learning compared to the unannotated baseline. Furthermore, position-aware control enhances the model's ability to generate better simplification than the position-agnostic version. The data and code are available at https://github.com/Chandrayee/CTRL-SIMP.
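
The difference between the two control styles named in the abstract can be illustrated with a small sketch. It is not the authors' code or data format: the tag names in OPS, the " | " separator, and the helper names position_aware and position_agnostic are all hypothetical, chosen only to show that one input marks edits in place while the other lists the contents to edit without their positions.

```python
from typing import List, Tuple

# Hypothetical short tags for the four edit types named in the abstract.
OPS = {
    "elaboration": "elab",
    "replacement": "rep",
    "deletion": "del",
    "insertion": "ins",
}

# (start_char, end_char, operation); spans are assumed not to overlap.
Span = Tuple[int, int, str]


def position_aware(text: str, spans: List[Span]) -> str:
    """Wrap each annotated span in place, so the model sees where to edit."""
    pieces = []
    cursor = 0
    for start, end, op in sorted(spans):
        tag = OPS[op]
        pieces.append(text[cursor:start])
        pieces.append(f"<{tag}>{text[start:end]}</{tag}>")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def position_agnostic(text: str, spans: List[Span]) -> str:
    """Append the contents to edit after the text, without their positions."""
    edits = [f"{OPS[op]}: {text[start:end]}" for start, end, op in sorted(spans)]
    if not edits:
        return text
    return text + " | " + " ; ".join(edits)


# Illustrative sentence, not taken from the dataset.
text = "The patient has dyspnea at rest."
start = text.index("dyspnea")
spans = [(start, start + len("dyspnea"), "replacement")]

print(position_aware(text, spans))
print(position_agnostic(text, spans))
```

In this sketch, strings of either form would serve as the source side of a sequence-to-sequence training pair; how the released code actually encodes the annotations should be checked against the repository linked in the abstract.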

Similar resources

Error-Controllable Simplification of Point Cloud

Point cloud simplification has become a vital step in any point-based surface processing pipeline. This paper describes a fast and effective algorithm for point cloud simplification with feature preservation. First, feature points are extracted by thresholding curvatures; second, non-feature points are covered by distinct balls, and the points in each ball are substituted by an optimized ...

Information Retrieval from Annotated Texts

Methods for the correct and efficient handling of annotations in a full-text retrieval system are investigated. The problem with annotations is that they cannot be treated as regular text, since this would disrupt proximity searches, but on the other hand, they cannot be ignored, as they may carry important information. Moreover, in some cases, a user may wish to restrict a search to prespecified ...

Auto-colorization Exploiting Annotated Dataset

Colorization is a very challenging task for computers, requiring very high performance in segmentation, object recognition, color understanding, etc., and even for humans it is demanding and arduous work to fully accomplish. To achieve this task, there have traditionally been three approaches: example based model, scribble based model, and data-driven based model which introduced relati...

Developing a pattern based on speech acts and language functions for developing materials for the course “the study of islamic texts translation”

The aim of the present study is to propose a pattern based on speech acts and language functions for developing materials for the course "The Study of Islamic Texts Translation". In the new pattern, in order to develop better and more engaging materials, and unlike the existing textbooks, Austin's (1962) levels of speech acts, Searle's (1976) classification of speech acts, and Halliday's (1978) language functions are used. For this purpose, 57 verses were randomly selected from different parts of the Quran...

Principles for Learning Controllable TTS from Annotated and Latent Variation

For building flexible and appealing high-quality speech synthesisers, it is desirable to be able to accommodate and reproduce fine variations in vocal expression present in natural speech. Synthesisers can enable control over such output properties by adding adjustable control parameters in parallel to their text input. If not annotated in training data, the values of these control inputs can b...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i12.26649